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Biologists rely heavily on the language of information, coding, and transmission that is commonplace in the field 
of information theory as developed by Claude Shannon, but there is open debate about whether such language 
is anything more than facile metaphor. Philosophers of biology have argued that when biologists talk about 
information in genes and in evolution, they are not talking about the sort of information that Shannon's theory 
addresses. First, philosophers have suggested that Shannon theory is only useful for developing a shallow notion 
of correlation, the so-called "causal sense" of information. Second they typically argue that in genetics and 
evolutionary biology, information language is used in a "semantic sense," whereas semantics are deliberately 
omitted from Shannon theory. Neither critique is well-founded. Here we propose an alternative to the causal and 
semantic senses of information: a transmission sense of information, in which an object X conveys information if 
the function of X is to reduce, by virtue of its sequence properties, uncertainty on the part of an agent who observes 
X. The transmission sense not only captures much of what biologists intend when they talk about information in 
genes, but also brings Shannon's theory back to the fore. By taking the viewpoint of a communications engineer 
and focusing on the decision problem of how information is to be packaged for transport, this approach resolves 
several problems that have plagued the information concept in biology, and highlights a number of important 
features of the way that information is encoded, stored, and transmitted as genetic sequence. 



[. INTRODUCTION 

Biologists think in terms of information at every level of in- 
vestigation. Signaling pathways transduce information, cells 
process information, animal signals convey information. In- 
formation flows in ecosystems, information is encoded in the 
DNA, information is carried by nerve impulses. In some 
domains the utility of the information concept goes unchal- 
lenged: when a brain scientist says that nerves transmit infor- 
mation, nobody balks. But when geneticists or evolutionary 
biologists use information language in their day-to-day work, 
a few biologists and many philosophers become anxious about 
whether this language can be justified as anything more than 



facile metaphor (|Godfrey-Smith 2000b 2008 Griesemer 



2005; Griffiths 2001 Sterelny 2000 Sterelny and Griffiths 



1999[ ). Why do the neurobiologists get a free pass while evo- 
lutionary geneticists get called on the carpet? When neuro- 
biologists talk about information, they have two things go- 
ing for them. First, there is a straightforward analogy be- 
tween electrical impulses in neural systems and the classic 
communications theory picture of source, channel, and re- 
ceiver (S hannon||1948| l. Second, information theory has obvi- 
ous "legs" in neurobiology: for decades neurobiologists have 
profitably used the theoretical apparatus of information the- 
ory to understand their study systems. Geneticists are not so 
fortunate. For them, the analogy to communication theory 
is less obvious. Efforts to make this analogy explicit seem 
forced at best and the most successful uses of information- 
theoretic reasoning within the field of genetics rarely make 
explicit their information-theoretic foundations or make use 
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of information-theoretic language ( |Crick et al\ |1957 
[sternl[T97T]|Freeland and Hurst| [19981 |Kimura|[196T 



Felsen- 



As a consequence, philosophers have concluded that the 
mathematical theory of communication pioneered by Claude 
Shannon in 1948 (hereafter "Shannon theory") is inadequate 
to ground the notion of information in genetics and evolution- 
ary biology. First, philosophers have unfairly suggested that 
Shannon theory is only useful for developing a shallow no- 
tion of correlation, the so-called "causal sense" of informa- 
tion. Second, they typically argue that in genetics and evolu- 
tionary biology, information language is used in a "semantic 
sense" — and of course semantics are deliberately omitted 
from Shannon theory. 

Neither critique is well-founded. In this paper we begin by 
summarizing the causal and semantic views of information. 
We then propose an alternative — a transmission sense of in- 
formation — that not only captures much of what biologists 
intend when they talk about information in genes, but also 
brings Shannon's theory back to the fore. 



Causal view of information 



Inspired by Dretsk e (|Dretske|^ 1983|l, several authors 



(Godfrey-Smith 2008; Griffiths 2001; Griffiths and Gray 
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|1994| |Sterelny and Griffiths| |1999 1 have explored Shannon 
theory as a grounding for information language in biology. 
They derive roughly the following picture: The key currency 
in information theory is entropy H(X) and the key statistic 
is mutual information I(X;Y), where X and Y are random 
variables. Information is conveyed in Grice's sense of nat- 
ural meaning ( |Grice| |1957| l: whenever Y is correlated with 
X, we can say that Y carries information about X. There is 
no deep notion of meaning or coding here. "[W]hen a biol- 
ogist introduces information in this sense to a description of 
gene action or other processes, she is not introducing some 
new and special kind of relation or property", Godfrey Smith 
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writes, "She is just adopting a particular quantitative frame- 
work for describing ordinary correlations." (Godfrey-Smith 
2008). In this causal sense of information, genes carry in- 
formation about phenotypes just as smoke carries information 
about fire, nothing more. If this is all that biologists are doing 
when they talk about how the genotype carries information 
about phenotype, this is a shallow use of the information con- 
cept! 

Not only is this sense shallow, it fails to capture the direc- 
tional flow of information from genotype to phenotype that is 
the central dogma of molecular biology ( |Crick| |1970[ ). If by 
"G has information about P" we mean only that I(G;P) > 0, 
then we are not acknowleding the direction of information 
from genotype to phenotype ( |Godfrey-Srmth"l |2000b| [2008] 
Griffiths 2001 ). The reason is that the mutual information / is 
by definition a symmetric quantity; I(G;P) — I(P;G). Math- 
ematically, the amount of information that knowing the geno- 
type G provides about the phenotype P is always exactly equal 
to the amount of information that knowing the phenotype P 
provides about the genotype G (see Fig.[TJi. 

I ( G '<P) Causal information / ( - P ; G ) 
Semantic information 



Genome 



Phenotype 



Environment 



Figure 1 Information theory restricted to a descriptive statistics for 
correlations. Information flows in both directions between genotype 
and phenotype, I(P; G) = I(G;P), and, according to the parity thesis, 
there is nothing that privileges genes over environment. 

Worse still, this notion of mutual information fails to cap- 
ture the sense of privilege that biologists tend to ascribe to the 
informational molecule DNA over other contributors to phe- 
notype. So far as causal covariance is concerned, both genes 
G and environment E influence phenotype P — and in prin- 
ciple we can equally well compute either I(G;P) and I(E;P). 
So it seems that Shannon theory has no way of singling out 
DNA as an information-bearing entity. 

This criticism is formalized as the parity thesis, and is 
crafted around an important result in information theory that 
the roles of source and channel conditions are exchangeable 
(Griffiths a nd Gray] |1994[ ). Typically when one sits in front 
of the television, the football broadcast is the signal. The 
weather, a crow landing on the television antenna, interference 
from a neighbor's microwave — these are sources of noise, 
the channel conditions for our transmission. But a television 
repairman has an opposite view. He doesn't want to watch 
the game, he wants to learn about what is altering the trans- 
mission from the station. So he tunes your set to a station 
broadcasting a test pattern. For the repairman, this test pattern 
provides channel conditions to read off the signal of how the 
transmission is being altered. As Sterelny and Griffiths point 
out (1999), "The sender/channel distinction is a fact about our 
interests, not a fact about the physical world." The parity the- 
sis applies this logic to genes and environment. In the parity 



view, whether it is genome or environment that carries infor- 
mation must be a fact about our interests, not a fact about the 
world. 

The problem with these arguments is that they adopt a few 
tools from Shannon theory, but neglect its raison d'etre: the 
underlying decision problem of how to package information 
for transport. Before delving deeper into Shannon theory, we 
will take a brief detour to summarize the semantic sense of 
information in biology. 



Semantic view of information 

In addition to the limitations enumerated above, the causal 
view fails to highlight the intentional, representational nature 



of genes and other biological objects (|Godfrey-Smith 2008 



Griffit hs and Gray| [1994) [Sheal [20071 |Sterelny and Gr iffiths 



1999 1. When biologists talk about genes as informational 



molecules, this argument goes, it is not because they are cor- 
related with other things (e.g. amino acid sequence or phe- 
notype), but rather because they represent other things. This 
semantic sense of information in which "genes semantically 
specify their normal products" ( jGodfrey- Smith 2007 1 cannot 
be captured using Shannon theory, which is by design silent 
on semantic matters. 

But what is it that genes are supposed to represent? Much of 
the conventional language of molecular biology suggests phe- 
notypes as an obvious candidate, and this is the approach that 
Maynard Smith takes in a target article that triggered much 
of the recent debate over the information concept in biology 
(Maynard Smith, 2000). At first glance, this view has several 
things to recommend it. A semantic notion of genetic infor- 
mation captures the directionality discussed above: genes rep- 
resent phenotypes but phenotypes do not represent genes. One 
could also try to argue that the semantic view privileges genes 
in that we can say that genes have a representational message 
about phenotype but environment does not. Finally, it allows 
for misrepresentation or false repre sentation, whereas causal 
information does not ( Griffiths 200 1 1 > 1 . 

Does this mean that the problem is solved? No — |Grif-| 
|ftfhs| ( |2001| > and |Godfrey-Smith|p008) argue that semantic in- 
formation remains vulnerable to the parity thesis. Moreover, 
Godfrey-Smith (1999,2008) and Griffiths (2001) note that the 
reach of the semantic information concept within genetics is 
very shallow: legitimate talk of semantic representation can 
go no further than genes coding for amino acids. Beyond 
this point, the mapping from genotype forward is context- 
dependent and hopelessly entangled in a mass of causal inter- 
actions. Thus, these authors conclude, the relation from genes 
to phenotype cannot be a representational one. Accordingly, it 
seems as if the semantic view of information has been pushed 
as far as it will go, and yet we are left without a fully sat- 
isfactory account of the information concept in biology. Let 



1 We think that Stegmann handles the misrepresentation issue even more 
cleanly with a shallow semantic notion of genes as conveying instructional, 
as opposed to representational, content jStegmann 2004| 
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us therefore return to Shannon's information theory, but move 
beyond the causal sense. 



II. A TRANSMISSION VIEW OF INFORMATION 

As we described above, philosophers of biology largely re- 
strict Shannon theory to a descriptive statistics for correla- 
tions. This misses the point. At the core of Shannon theory is 
the study of how far mathematical objects such as sequences 
and functions can be compressed without losing their identity, 
and if compressed further, how much their structure will be 
distorted. From this foundation in the limits of compression 
emerges a richly practical theory of coding: information the- 
ory is a decision theory of how to package information for 
transport, efficiently. It is a theory about the structure of those 
sequences that efficiently transmit information. And it is a the- 
ory about the fundamental limits with which that information 
can be transmitted (|Cover and Thomas |2006{ |Pierce| |1980| 
|Shannon and Weaver||1949[|Yeung||2002| >. 

This decision-theoretic view of Shannon theory is missing 
from the discussions of information in biology. We will argue 
that this view provides a justification for information language 
as applied to genes, and that it also resolves the apparent and 
unappealing symmetries of (1) mutual information between 
genes and phenotype and (2) the parity thesis. 

In the original formulation of Shannon theory, information 
is what an agent packages for transmission through a channel 
to reduce uncertainty on the part of a receiver. This informa- 
tion is physically instantiated and spatiotemporally bounded. 
Thus, as Lloyd and Penfield (2003) note, information can be 
sent either from one place to another, or from one time to 
another 2 . Usually when we talk about sending information 
from one place to another, we posit two separate actors, one 
of whom sends a message that the other receives; when we 
talk about sending information from one time to another, we 
posit a single agent who stores information that she herself 
can later retrieve. But whether the message goes across space 
or time, whether there are one or two agents involved, whether 
we use the language of signal transmission or the language of 
data storage, mathematically these are exactly the same pro- 
cess. Thus in practice, when you package information and 
then send it either across the space dimension as a signal or 
across the time dimension via storage and retrieval, you are 
transmitting information. 

Think about what happens when you send a message to 
your friend by burning a compact disc. Your computer en- 
codes a message onto the digital medium. You send the 
medium through a channel (e.g. the postal service). Your 
friend, the receiver, puts the CD into her computer, the com- 
puter decodes the message, and she hears the sweet strains 
of Rick Astley. But it doesn't matter that you sent the disc 
through the mail - all of the mathematical operations that un- 



derly the information encoding are the same whether you send 
the CD to a friend or save it for your own later use. To cross 
space or time, we can encode the same way. Indeed, we use 
the same error-correcting codes for storage and retrieval on 
CDs as we do for sending digital images from deep space back 
to Earth ( ppral[T993] l. 

Taking this view of information and transmission, let us re- 
turn to the proposed schematic of biological information in 
Figure 1. This picture has neither a space dimension or a time 
dimension; information is not being sent anywhere. Here we 
simply have a correlation (if one takes a causal view) or a 
translation (if one takes a semantic view). Thus within the 
actual Shannon framework, Figure 1 simply illustrates a de- 
coder. Likewise, notice that the biological processes underly- 
ing the schematic in Figure 1 are not the processes that biol- 
ogists refer to when they talk about transmission. In biology, 
transmission genetics is the study of inheritance systems, not 
the study of transcription and translation, and genetic trans- 
mission is the passing of genes from one generation to another, 
not the passing of information from genotype to phenotype. 
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Figure 2 In biology, genetic transmission occurs vertically (from 
parent to offspring to grandoff spring). It is upon this axis that the 
transmission sense of information focuses. 

In life and in evolution the transmission of information goes 
from generation to generation to generation as in Figure [2] 
Here is the transmission; we know that genes are transmitted 
from parent to offspring in order to provide the offspring with 
information about how to make a living (e.g. metabolize sug- 
ars, create cell walls, etc.) in the world. This suggests that 
can make sense of a large fraction of the use of information 
language in biology if we adopt a transmission view of infor- 
mation^ . 



Transmission view of information 

An object X conveys information if the function 
of X is to reduce, by virtue of its sequence prop- 
erties, uncertainty on the part of an agent who ob- 
serves X. 



2 In practice, it takes time to send information from one place to another, but 
the conventional Shannon framework suppresses this time dimension. 



3 A message need not be composed of multiple characters to meet this defini- 
tion. Even a string of length one is a sequence, thus even a single character 
conveys information. 
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As with many aspects of science, the tools and language 
that we use have a strong influence on the questions that we 
think to ask — and once we shift to the transmission sense 
of information, our focus changes. When we view biologi- 
cal information as a semantic relationship, we are drawn to 
think like developmental biologists, about how information 
goes from an encoded form in the genotype to its expression 
in the phenotype. But when we talk about the transmission 
sense of information, we step out in an orthogonal direction 
and we can now see information as it flows through the pro- 
cess of intergenerational genetic transmission (Figure El. And 
once we do that, we can start to think about natural selection, 
the evolutionary process, and how information gets into the 
genome in the first place. 



Symmetry of mutual information 

By viewing Shannon information as a result of a decision 
process instead of as a correlation measure, we can resolve 
the concern that Godfrey-Smith raises about the bidirectional 
flow of information in Shannon theory. Godfrey-Smith dis- 
misses the causal sense of information because "information 
in the Shannon sense 'flows' in both directions, as it involves 
no more than learning about the state of one variable by at- 
tending to another" ( |Godfrey-Smith| |2008| l. Here Godfrey- 
Smith is referring to the symmetry of the mutual information 
measure: I(X;Y) = I(Y;X) (Figure 1). 

What is the mutual information actually measuring when 
we apply this equality in a communication context? An ex- 
ample helps. Peter and Paul are concerned about the state of 
the world W. Suppose that Peter observes a correlate X of the 
random variable W. We want him to communicate his obser- 
vation to Paul, and he does so using a signal Y. The mutual in- 
formation I(X;Y) tells us how effectively he conveys what he 
sees, on average, given the statistical distribution of possible 
X values, the properties of the channel across which the signal 
is sent, etc. Specifically, I(X;Y) = H{X)~H{X\Y) measures 
how much Paul learns by knowing Y about what Peter saw, 
X, again on average. Because I(X;Y) = H(X)-H(X\Y) = 
H(X)+H(Y)-H(X,Y) = H(Y)-H(Y\X) =I(Y;X), Peter 
knows exactly as much about what Paul learns as Paul learns 
about what Peter saw. But I(Y; X) is usually irrelevant when 
we think about the decision problem of communicating. In 
this context we want Peter to get a message about the world 
to Paul, and we rarely care how much Peter knows afterwards 
about what Paul has learned. 

This directionality is manifested within Shannon theory by 
the data processing inequality (Figure[3]) ( |Cover and Th omas 
2006 Yeung 2002 ). This theorem states that the act of pro- 
cessing data, deterministically or stochastically, can never 
generate additional information. A corollary pertains to com- 
munication: along a communication chain, information about 
the original source can be lost but never gained. In the sce- 
nario described above, both the observation step W — ► X and 
the communication step X — ► Y are steps in a Markov chain. 
For any Markov chain W —* X — > Y, the data processing in- 
equality states that I(W;X) > I(W;Y), In our example, the 
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Figure 3 Despite the symmetry of the mutual information I(X;Y) = 
I(Y;X), the data processing inequality reveals the directional flow of 
information in Shannon 's scheme. 



data processing inequality reveals that communication be- 
tween Peter and Paul is not symmetric. Paul may know as 
much about what Peter sent as Peter knows about what Paul 
received, but Paul does not in general know as much about 
what matters — the state of the world — as Peter does. Shan- 
non theory is not symmetric with respect to the direction of 
communication. 



The Parity thesis 

In the introduction to this paper, we described the parity 
thesis. While there is a good case to be made for parity be- 
tween genes and environment when we restrict our view to 
the horizontal development of phenotype from genotype, that 
parity is shattered when we look along the vertical axis of in- 
tergenerational transmission. 

Look at Figure 2, which corresponds to a neo-Darwinian 
view of evolution. In this model of the biological world, the 
transmission concept cleanly separates genes from environ- 
ments. The former are transmitted across generations, the 
latter are not. Moreover, taking a teleofunctional view as 
|Sterelny et q/.| ([T996 ) do for replicators in general, the hypoth- 
esis that genes are for transmission across generations is richly 
supported by the physical structure of the DNA. Genes are 
made out of DNA, a molecule that is exquisitely fashioned so 
as to (1) encode lots of sequence information in small space, 
(2) be incredibly easy to replicate, (3) be arbitrarily and in- 
finitely extensible in what it can say, and (4) be structurally 
very stable and inert ( |Lewontin| |1992| l. In fact, DNA is per- 
haps the most impressive known substance with respect to (1) 
and (2). No machine can look at a protein and run off a copy; 
DNA is exquisitely adapted so that a relatively simple ma- 
chine, the DNA polymerase, does this at great speed and high 
fidelity. Think about how amazing it is that PCR works. It 
is as if you could throw a hard drive in a water bath with a 
few enzymes and a few raw materials, run the temperature 
through a few cycles, and pull out millions of identical hard 
drives. DNA practically screams "I am for storage and trans- 
mission!" 

One might object that Figure 2 conveys an over-simplified 
view of the world. This is true. A more sophisticated view 
of the evolutionary process allows for additional channels of 
intergenerational transmission and information flow: environ- 
ments can be constructed and inherited (Odling-Smee et al. 



5 



2003). Non-genetic biological structures such as membranes 
and centrosomes are inherited (Griffiths and Knight| |1998 



98) 
ler. 



and can even be argued to carry some information (Griesemer 
2005). Methylation provides an extensive layer of markup 
on top of nucleic acid sequence. Developmental switches 
actively transduce environmental information into epigeneti- 
cally heritable forms ( [Grirrithsl[200T] ). 

But such an objection misses our point. Our aim with the 
transmission sense of information is not to single out uniquely 
the genes as having some special property which we deny to 
all other biological structures, but rather to identify those com- 
ponents of biological systems that have information storage, 
transmission, and retrieval as their primary function. Methy- 
lation tags are obvious members of this information-bearing 
class: they carry information across generations in the trans- 
mission sense and this appears to be their primary function. 
Extrinsic features of the environment such as ambient temper- 
ature are obviously not members of this class: they are not 
transmitted across generations, they carry information only in 
the causal sense and information transmission is not their role 
under any reasonable teleofunctional explanation. Biological 
structures such as membranes and centrosomes may appear as 
some sort of middle ground, but we note that (1) their primary 
function is not an informational one and (2) their bandwidth 
is extremely restricted compared to that of DNA sequence. 
Birds' nests ( |Sterelny et q/.| [T996) could be seen as an envi- 
ronmental analogue to these intracellular structures, whereas 
libraries start to push toward genes and methylation tags in 
their informational capacity. Developmental switches ( |Grif-| 
fiths| |200T[ ) are another interesting case; these transduce envi- 



ronmental information but in addition to their bandwidth lim- 
itations they appear to have a more limited intergenerational 
transmission role. Genes may not be unique in their ability to 
convey information across generations — but at the same time 
a transmission view makes it clear that not all components of 



the developmental matrix (Griffiths 2001 Griffiths and Gray 



1994) enjoy parity in their informational capacities. 

The parity thesis typically is linked to Shannon's informa- 
tion theory via the claim that "The source/channel distinc- 
tion is imposed on a natural causal system by the observer." 
(Griffiths 2001, p. 398) What is signal, and what is noise — 
Sterelny and Griffiths (1999]) take this to be merely a reflec- 



tion of our interests. So must we impose our own notions of 
what makes an appropriate reference frame in order to single 
out certain components of the developmental matrix as signal 
and others as noise? No. We are not the ones who pick the 
reference frame, natural selection does. Because natural se- 
lection operates on heritable variation, it acts upon some com- 
ponents of the developmental matrix differently than others. 
For a biologist trying to understand life, the source-channel 
distinction is imposed not by the observer but rather by the 
process of natural selection from which life arose and diversi- 
fied. 

To better understand the role of natural selection it helps 
to expand Figure [2] somewhat. (For simplicity we retain our 
focus on the genes as transmitted elements, but one could ex- 
tend this to include other heritable structures). In Figure[2] we 
highlight the fact that it is the genes, and not the environment, 
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Figure 4 Transmission and natural selection. With the parent send- 
ing variant messages to each offspring and natural selection acting 
on the phenotype, information can accumulate in the genome. 



which is transmitted from generation to generation. In Figure 
|4]we highlight the fact that not all genes are transmitted to the 
next generation. It is by the means of variation in the genes 
and selection on the phenotypes with which they are corre- 
lated that information can built up in the genome over time 
( |Felsensteml|1971| l. 



Causal information versus transmitted information 

One motivation for replacing causal sense views of infor- 
mation with semantic-sense views is that the causal sense of 
information appears to cast too broad of a net. Any physical 
system with correlations among its components carries causal 
information — but in their use of the information concept, bi- 
ologists appear to mean something stronger than the notion of 
natural meaning that has smoke in the sky carrying informa- 
tion about a fire below (Godfrey-Smith 2008). If we substi- 



tute a transmission view of information for a semantic view, 
will we be driven back to this overly-broad notion of infor- 
mation? Not at all. Like naturalized views of semantics, the 
transmission notion of information rests upon function: to say 
that X carries information, we require that the function of X 
be to reduce uncertainty on the part of a receiver. 

The failure to consider function when talking about infor- 
mation sometimes generates confusion among practicing bi- 
ologists. After all, there are correlations everywhere in bi- 
ological systems, measuring them is what we do as biolo- 
gists, and we often talk about these correlations as informa- 
tion. This language is understandable; indeed these correla- 
tions provide us with information about biological systems. 



But this is merely causal sense information (jGodfrey- Smith 



1999). As Godfrey-Smith explains, when a systematist uses 
gene sequences to make inferences about population history, 
"there is no more to this kind of information than correlation 
or 'natural meaning' ; the genes are not trying to tell us about 
their past." (Godfrey-Smith 1999 1. In other words, these cor- 
relations do not convey teleosemantic information. Nor can 
these correlations be considered information in the transmis- 
sion sense. 
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To expand upon this distinction, an example from popula- 
tion genetics is helpful. Voight and colleagues (Voig hTef al.\ 
|2006[ ) developed a method for inferring positive selection at 
polymorphic loci in the human genome. Their key insight is 
that with enough sequence data from sufficiently many mem- 
bers of the population, we can pick out regions of the genome 
that have unusually long haplotypes of low diversity. Such ex- 
tended haplotype blocks tend to surround an allele that has re- 
cently risen in frequency due to strong selection, because there 
has not been enough time for recombination to break down the 
association between the favored allele and the genetic back- 
ground in which it arose. Using this method, Voight and 
colleagues find strong evidence for recent selection among 
Europeans in the lactase gene LCT, which is important for 
metabolism of lactose beyond early childhood. The favored 
allele results from a single nucleotide change 14 kb upstream 
of the lactase gene on the nearly 1 Mb haplotype. Positive se- 
lection has presumably occurred because the ability to process 
lactase throughout life became advantageous with the inven- 
tion of animal agriculture approximately 10,000 years ago. 



What does this have to do with information? There is infor- 
mation about the history of selection in the population-level 
correlations. Voight et al. found an extended haplotype length 
surrounding the LCT+ allele relative to that around the LCT 
allele. But notice that we have to observe the genotypes of 
multiple individuals in order to determine that one allele at 
the LCT locus is surrounded by longer haplotype blocks than 
is the other. Once we have made observations of multiple 
genomes, we as external observers can conclude something 
about the history of selection on the population. But this in- 
formation is not available at the level of a single individual. 
A single individual cannot look at its own genome and notice 
a longer (or shorter) haplotype block around any given focal 
locus — these haplotype blocks are defined with respect to 
the genotypes of others in the population. A single individ- 
ual can only look at its own genome and see a sequence of 
base pairs. This sequence of base pairs is what is transmit- 
ted, it is what has the function of reducing uncertainty on the 
part of the agent who observes it 4 . These individual gene se- 
quences are the entities that have an informational function 
in biology (though bioinformaticians have not always recog- 
nized this distinction ( |Adami| |2002| l). The population-level 
statistics that geneticists use to infer history are informative 
but they are not information in the transmission sense. They 
are merely the smoke that is cast off by the fire of natural se- 
lection. 



Coding without appeal to semantics 

The transmission sense of information allows us to sepa- 
rate claims about how information is transmitted from claims 
about what information means 5 . Indeed, we can study how in- 
formation is transmitted without having any knowledge of the 
"codebook" for how to interpret the message, or even what the 
information represents. 

In many biological studies, we are in exactly this position. 
Again an example — this time drawn from neurobiology — 
is helpful. In a study of the fly visual system, de Ruyter 
van Steveninck and colleagues presented flies with a moving 
grating as a visual stimulus, and made single-cell recordings 
of the spike train from the HI visual neuron (de Ru yter van] 
|Steveninck et al. \ |1997| l. This neuron is sensitive to move- 
ment, but it is not known how movement information is en- 
coded into the spike train, nor even what aspects of movement 
are being represented. Nonetheless de Ruyter van Steveninck 
and colleagues were able to determine how much informa- 
tion this neuron is able to encode. The investigators exposed 
a fly to the stimulus, and measured the (average) entropy of 
the spike train. This is the so-called total entropy for the neu- 
ron's output. They then looked at what happens if you play 
the same stimulus back repeatedly: how much does the re- 
sultant spike train vary from previous trials? This is the so- 
called noise entropy. The information that the spike train car- 
ries about the stimulus, i.e., the mutual information between 
spike train and stimulus, is simply the difference of these two 
quantities. Using this approach, the authors were able to show 
that this single insect neuron conveys approximately 80 bits 
of information per second about a dynamic stimulus. Thus an 
individual visual neuron achieves a bit-rate that is roughly 7 
times the bit rate of a skilled touch typist 6 ! More importantly, 
the researchers were able to compare the response of this neu- 
ron to static stimuli with the response of the neuron to natural 
patterns of motion. They found that for natural patterns the 
neuron is able to attain the high bandwidth that it does by "es- 
tablishing precise temporal relations between individual ac- 
tion potentials and events in the sensory stimulus". By doing 
so, the neuron's response to the dynamic stimulus greatly sur- 
passes the bit rate that could be obtained if the stimulus was 
encoded by a simple matching of spike rate to stimulus inten- 
sity. Subsequent investigators have used related methods to 
show that evolved sensory systems are tuned to natural stim- 



4 Although causal-sense information is transmitted from the population at 
time t to the population at time t+1 in the population frequencies of hap- 
lotypes, this is not transmission-sense information because the function of 
these population-level haplotype assemblages is not to reduce uncertainty 
on the part of future populations. 



5 The source-channel separation theorem (Cover and Thomas 2006, Chapter 
7, p.218) proves that in any physical communication system for error-free 
transmission over a noisy channel, one can entirely decouple the process 
of tuning the code to the nature of the specific channel from not only the 
semantic reference of the signal but indeed from all statistics of the mes- 
sage source. This follows because the theorem states that one can achieve 
channel capacity with separate source and channel coders — and in this 
setup the source coder can always be configured so as to return output that 
maximizes the entropy given the symbol set. 

6 Using Shannon's 1950 upper bound on bits per letter and his estimate of 
letters per word in the English language I Shannon 
mate the bit rate of a touch typist as l20 ™ oras 1 ml 
1 1 .7 bits/second. 



19501, we can esti- 

ite 4.5 letters 1.3 bits _ 



minute 60 seconds word 
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uli, to study the properties of neural adaptation and history 
dependence, and to examine temporal sensitivity — all with- 
out knowing the way in which the signals that they study are 
actually encoded. 

de Ruyter van Steveninck et al. sum up the power of being 
able to study information without appeal to semantics: "This 
characterization of ... information transmission is independent 
of any assumptions about [or knowledge of!] which features 
of the stimulus are being encoded or about which features of 
the spike train are most important in the code" (de Ruyter van 
ISteveninck efaL\ [1997) 

This brings us back to Shannon theory. When information 
theorists think about coding, they are not thinking about se- 
mantic properties. All of the semantic properties are stuffed 
into the codebook, the interface between source structure and 
channel structure, which to information theorists is as interest- 
ing as a phonebook is to sociologists. When an information 
theorist says "Tell me how data stream A codes for message 
set B", she is not asking you to read her the codebook. She 
is asking you to tell her about compression, channel capac- 
ity, distortion structure, redundancy, and so forth. That is, she 
wants to know how the structure of the code reflects the statis- 
tical properties of the data source and the channel, with respect 
to the decision problem of effectively packaging information 
for transport. 

With these things in focus, we can now look at the concept 
of arbitrariness, what it means, and why this concept is criti- 
cally important in biological coding. 



III. INFORMATION THEORY AND ARBITRARINESS 

In arguing that DNA is an informational molecule, May- 
nard Smith (Maynard Smith, , |2000|l app eals to Jacques 
Monod's concept of gratuite ( Monod||1971[ ), and a number of 
additional authors have further explored this thread (Godfrey^ 
|Smifh| |2000a|b| |Stegmann| [2005] l. For Monod, gratuite was 
an important component of the logical structure of his the- 
ory of gene regulation. Gratuite is the notion that in principle 
regulatory proteins can cause any inducer or repressor to in- 
fluence the expression of any region of DNA. There need be 
no direct chemical relation between the structure of an inducer 
and the nucleic acid sequence on which it operates. Maynard 
Smith observes that we can see something like gratuite in the 
structure of the genetic code as well: there is an arbitrary as- 
sociation between codons and the amino acids that they spec- 
ify. 

Yet as they grapple with this idea of an arbitrary code, these 
authors confront the fact that the genetic code is not a ran- 
dom assignment of codons to amino acids, but rather a one- 
in-a-million evolved schema for associating these molecules: 
the genetic code is structured so as to smooth the muta- 
tional landscape and ensure that common translational errors 
generate amino acid replacements between chemically simi- 



or contingent on random historical processes as in Crick's 
frozen accident hypothesis. Rather, they are making a semi- 
otic claim. Arbitrariness refers to the fact that "[m]olecular 
symbols in biology are symbolic," as opposed to indexical or 



lar amino acids (Freeland and Hurst 1998, Haig and Hurst 



|1991[ ). So what can these authors mean when they say that the 
code is arbitrary? Maynard Smith, Godfrey-Smith, and others 
are not suggesting that the structure of the code is random 



iconic (Maynard Smith 2000). In the case of the genetic code, 
this means that the association between a codon and its cor- 
responding amino acid is not driven by the immediate steric 
interactions between the codon and the amino acid, but instead 
is mediated by an extensive tRNA structure that in principle 
could have coupled this codon to any other amino acid instead. 

This fact is enormously important to the function of the bi- 
ological code — not as a matter of the semiotic classifications 
that fascinated Charles Pierce, but rather to solve the sort of 
decision problem that motivated Claude Shannon. From the 
symbolic relation between code and product, there arise the 
degrees of freedom that a communication engineer requires to 
tune the code to the statistical properties of source and chan- 
nel. To see how this works we will visit an example from the 
early history of telecommunications. 

For over one hundred years, Morse code was the standard 
protocol for telegraph and radio communication. The code 
transcribes the English alphabet into codewords composed of 
short pulses called dots and long pulses called dashes. For 
example, the letter E is represented by a single dot " . ", the 
letter T is represented by a single dash the letter Q by the 

quartet " — . -", and the letter J by " . ". At first glance, 

the mapping between letters and Morse codewords appears to 
be arbitrary. They are certainly symbolic rather than iconic 
or indexical. But there is an important pattern to the way that 
letters are assigned codes in Morse code. 

But instead of assigning codewords sequentially, " . " to A, 
"-" to B, " . . " to C, Samuel Morse exploited the degrees of 
freedom available for codeword assignments to make an effi- 
cent code for fast transmission of English sentences. He could 
not assign short code words to every single letter — there sim- 
ply are not enough short code words to go around. Instead, by 
assigning the shortest code words to the most commonly used 
letters, Morse created a code in which transmissions would 
on average be shorter than if he had used sequential codeword 
assignments. 

Morse could not have done this with pictograms. The lee- 
way to associate any message with any code word provides 
the communications engineer with the degrees of freedom that 
he needs to tune the semantic and statistical properties of the 
source messages to the transmission cost and error properties 
of the channel. In the absence of a full picture of the engi- 
neer's decision problem, the code might look arbitrary. But 
a well-chosen code is not arbitrary at all; it solves a decision 
problem for packaging. 

Morse exploited the available degrees of freedom in his 
choice of codes; apparently natural selection has done the 
same in evolving a one-in-a-million genetic code. Thinking 
about these degrees of freedom — along with an important 
thought experiment — helped us to understand the role of cod- 
ing in the transmission sense of information. Godfrey-Smith 
( 2000b ) imagined a hypothetical world composed of "protein 
genes" as a way to explore the importance of the coding con- 
cept in biology. Since the idea of coding presumably refers 
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to the fact that DNA provides an arbitrary combinatorial rep- 
resentation of amino acid sequence, Godfrey-Smith considers 
the nature of a world in which the hereditary material was 
neither arbitrary nor representational. He images a world of 
"protein-genes" in which there is no translation but instead a 
copying system in which amino acid sequences, assisted by 
coupling molecules, replicate using previous amino acid se- 
quences as templates. In that world, Godfrey-Smith argues, 
there is nothing that corresponds to coding, and yet stepping 
away from the microscope, biology functions much as be- 
fore. In Godfrey-Smith's protein-sample world, there is no 
code, no compression, no redundancy, and information theory 
can merely be applied as a descriptive statistics for correla- 
tions. One could even go so far as to argue that in a protein- 
genes world, physical structure and not information is inher- 
ited across generations. Thus Godfrey-Smith concludes that 
"Removing genetic coding from the world need not change 
much else, and this gives support to my claim that we should 
only think of coding as part of an explanation of how cells 
achieve the specific task of putting amino acids in the right or- 
der," rather than something fundamental to the logical struc- 
ture of biology. 

But there is a critical difference between a world with a 
proper genetic code and world based upon protein-genes: only 
the former allows an arbitrary combinatorical mapping be- 
tween templates and products. From an information-theoretic 
perspective, this is absolutely critical. The degrees of freedom 
to construct an arbitrary mapping of this sort turn the problem 
of code evolution from one of passing physical samples to the 
next generation, into a decision-theoretic problem of how to 
package information for transport. In the protein-genes world, 
the fidelity of transmission is at the mercy of the biochemical 
technology for copying. There are no degrees of freedom for 
structuring redundancy, minimizing distortion, or conducting 
the other optimization activities of a communications engi- 
neer. In a DNA-based code, the chemically arbitrary assign- 
ments of nucleotide triplets to amino acids via tRNAs offer 
the degrees of freedom to do all of the above. While the pre- 
cise dynamics of code evolution remain unknown, it appears 
that natural selection has put these degrees of freedom to good 
use. 

The difference between a protein-genes world and a DNA- 
based world became clear by taking the perspective of a com- 
munications engineer. Throughout the paper this has been our 
approach. Correlations, symmetry of mutual information, the 
parity thesis, arbitrariness, coding — all these these come into 
focus from a communications viewpoint. We see what makes 
the genetic code a code, and we get a new perspective on the 
information language that is part of the everyday working vo- 
cabulary of researchers in genetics and evolutionary biology. 
The transmission sense of information justifies such language 
as more than shallow metaphor. 
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